Our main objectives are to handle some of the common data problems encountered with marine fastloc GPS data and diving data, using a narwhal dataset provided by Dr. Marianne Marcoux. Specifically, we aim to address the following:
Irregular locations
Time gaps
Including diving data-streams
First, we’ll set up the workspace with the required packages.
library(momentuHMM)
library(dplyr)
library(tidyr)
library(lubridate)
library(adehabitatLT)
library(sf)
library(tmap)
library(terra)
library(units)
library(stringr)
library(diveMove)
library(conicfit)
library(car)
library(mitools)
library(doFuture)
library(corrplot) # Pearson's correlation matrix using cor()
library(data.table)
library(earthtide)
library(knitr)
Please also set the working directory to “Day2” of the HMM workshop.
For simplicity, we will also only look at the data for the month of August, 2017.
The data we obtain is often quite messy, with records missing
information and other records duplicated. We can keep only records
with complete location data using !is.na(x) & !is.na(y),
and remove duplicate records (same time, location, and location class)
using the lag function from dplyr, which
returns the value from the previous row.
tracks <- tracks %>%
# remove missing locations
filter(!is.na(x) & !is.na(y),
# remove identical records
!(time == lag(time) & x == lag(x) & y == lag(y) & loc_class == lag(loc_class)))
Next, we’ll convert the data to a spatial dataset using the
sf package and plot the data. First, we define the
coordinate reference system of the original data (in this case WGS84,
which is defined by the EPSG code 4326). Next, we will
project the data into NAD83(CSRS) UTM zone 21N (EPSG:2962), which
expresses the coordinates in metre units with minimal distortion for
this data set.
tracks_proj <- tracks %>%
st_as_sf(coords = c("x", "y")) %>% # converts to an sf object
st_set_crs(4326) %>% # define CRS
st_transform(2962) # reproject data to UTM
For the first part of this tutorial, we’ll use only the fastloc GPS data so we don’t have to deal with location error.
We lose some data, particularly near the end of the tracks, but we will integrate ARGOS locations later in this tutorial.
Now, we can map the data using the tmap package to
visualize what it looks like.
# plot GPS
tracks_gps %>%
group_by(ID) %>%
summarise(do_union = FALSE) %>%
st_cast("LINESTRING") %>%
tm_shape() +
tm_lines(col = "ID", palette = "Dark2")
The classic HMM assumes the observation data are in discrete time and that there are no missing data in the predictor variables. There are two key decisions we must make: the temporal resolution to use, and how to address data gaps. The desired resolution depends predominantly on the biological question you are asking, as different behaviours and biological processes occur at different spatial and temporal scales (e.g., seasonal migration, daily movement between foraging and resting grounds, and fine-scale foraging decisions). Generally, higher-resolution data is preferred as it carries more information; however, it is possible to have too high a resolution, wherein fine-scale variability drowns out the signal from coarse-scale patterns of interest (e.g., seasonal migration). In this case, we will be linking the movement data with high-resolution (75 s) dive data to look at finer-scale behaviours (on the order of a few hours). My rule of thumb is that you want 3-50 data points per behaviour. For behaviours spanning several hours, that roughly corresponds to a desired resolution between 2 min and 60 min.
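As a quick sanity check of this rule of thumb, we can compute the implied resolution range for a behaviour of a given duration (the 3 h duration below is purely illustrative, not taken from the narwhal data):

```r
# implied sampling resolution (in minutes) for a behaviour lasting ~3 hours,
# using the 3-50 data-points-per-behaviour rule of thumb
behaviour_duration <- 3 * 60                 # behaviour duration in minutes
c(finest   = behaviour_duration / 50,        # 50 points per behaviour
  coarsest = behaviour_duration / 3)         # 3 points per behaviour
```

For a ~3 h behaviour this gives a usable resolution somewhere between roughly 4 and 60 min, consistent with the range quoted above.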
First, let’s calculate the time difference between successive records
using difftime and lead (which compares the current row
to the following row). For the last record of each individual (i.e., when
ID != lead(ID)), we will set dt to NA.
# Calculate time difference between locations
tracks_gps <- tracks_gps %>%
mutate(dt = ifelse(ID == lead(ID), # If next data row is same individual
difftime(lead(time), time, units = "mins"), NA
)) # calculate time difference
Let’s see what resolutions may be possible in the data by looking at the most frequent time gaps.
# Zoom in on short intervals
hist(tracks_gps$dt, 1000, main = NA, xlab = "Time difference (min)", xlim = c(0,100))
##
## 10 11 12 22 9 13
## 400 145 90 64 63 63
We see that the most frequent time gap is 10 min, followed by 11, 12, 22, 9, and 13 min. We also see that the majority of the gaps are < 60 min; however, some are in excess of 600 min. Because HMMs assume there are no missing records, finer resolutions will leave more gaps that would need to be interpolated. Frequent and large data gaps can be difficult to handle, especially as the number of missing data points approaches or exceeds the number of existing data points; we really want to avoid this. Let’s examine the potential data structure at different resolutions for the different animals.
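A frequency table of gaps like the one shown above can be produced along these lines; toy dt values stand in here so the snippet runs on its own — with the real data you would tabulate tracks_gps$dt:

```r
# tabulate time gaps and sort by frequency (toy values for illustration;
# substitute tracks_gps$dt for the real data)
dt <- c(10, 10, 10, 10, 11, 11, 12, 22, 9, 13)
sort(table(dt), decreasing = TRUE)
```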
We first create a function that approximates the proportion of missing locations we would have for a given resolution.
# Make function to estimate proportion of missing location
p_na <- function(time, n_loc, res) {
time <- round_date(time, paste(res,"min")) # round to nearest resolution
time <- na.omit(time[time != lead(time)]) # remove duplicate time
# calculate maximum number of locations
max_n_loc <- length(seq(time[1], tail(time, 1) + res*60,
by = as.difftime(res, units = "mins")))
n_NA <- max_n_loc - (length(time)+1)
n_NA / max_n_loc
}
We can now use this function to look at the proportion of NAs we would get with 10, 20, 30, and 60 min resolutions.
# summarise track dt
tracks_gps %>%
st_drop_geometry() %>%
group_by(ID) %>%
summarise(p_NA_10m = p_na(time, n_loc, 10), # 10 min
p_NA_20m = p_na(time, n_loc, 20), # 20 min
p_NA_30m = p_na(time, n_loc, 30), # 30 min
p_NA_60m = p_na(time, n_loc, 60)) # 60 min
## # A tibble: 3 × 5
## ID p_NA_10m p_NA_20m p_NA_30m p_NA_60m
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 T172062 0.486 0.236 0.191 0.152
## 2 T172064 0.502 0.203 0.120 0.0736
## 3 T172066 0.488 0.230 0.185 0.160
Here we see that at the 10 min interval, around 50% of the locations would be missing. This is the limit I would be comfortable with, since at finer resolutions the simulated data would outweigh the real data and may bias the results. Very large data gaps that contribute much of the missing locations can be excluded from the analysis; therefore, for this tutorial, I will use a 10 min resolution.
There are several ways to deal with data gaps, and I will address four:
1. Interpolation (linear and statistical)
2. Voiding data gaps
3. Path segmentation
4. Multiple imputation
Convert tracks
# convert tracks back to data.frame with xy coordinates
tracks_gps_ls <- tracks_gps %>%
mutate(x = st_coordinates(tracks_gps)[,"X"],
y = st_coordinates(tracks_gps)[,"Y"]) %>%
st_drop_geometry() %>%
split(.,.$ID) # split into list
For large datasets with few and small gaps, the simplest approach is to use linear interpolation. First, let’s identify the minute mark on which we most often have data.
# which minute has the most data
tracks_gps %>%
st_drop_geometry() %>% # convert back to data.frame
group_by(ID) %>%
summarise(minute = str_sub(minute(time), -1)) %>%
table()
## `summarise()` has grouped output by 'ID'. You can override using the `.groups`
## argument.
## minute
## ID 0 1 2 3 4 5 6 7 8 9
## T172062 80 78 62 52 37 40 40 45 37 32
## T172064 77 68 71 67 32 32 40 32 31 38
## T172066 138 52 73 48 44 42 37 27 22 18
It looks like for all three tracks, the most locations fall on minutes ending in 0 (i.e., 0, 10, 20, 30, 40, or 50 min past the hour). Next, for each track, we will create a vector of times at which to estimate locations.
# create full time series on which to estimate locations rounded to the nearest 10 min
tracks_gps_ls_time <- tracks_gps %>%
st_drop_geometry() %>% # convert to data.frame
group_by(ID) %>%
summarise(time = seq(round_date(first(time), "10 min"),
round_date(last(time), "10 min"),
by = 60*10)) %>%
split(.,.$ID) # split into list
## `summarise()` has grouped output by 'ID'. You can override using the `.groups`
## argument.
Now, we can interpolate the locations for each track.
# function to create a data frame with approximated locations
approx_locs <- function(tracks, times){
data.frame(ID = times$ID,
time = times$time,
x = approx(tracks$time, tracks$x,
xout = times$time)$y,
y = approx(tracks$time, tracks$y,
xout = times$time)$y)
}
# Interpolate the location at the times from the sequence
tracks_gps_linear <- mapply(approx_locs, tracks_gps_ls, tracks_gps_ls_time,
SIMPLIFY = F) %>%
do.call("rbind", .) # convert list of tracks back to a single data.frame
# remove row names added by rbind
rownames(tracks_gps_linear) <- NULL
# plot locations
plot(tracks_gps_linear$x, tracks_gps_linear$y, pch = 20, col = "red", xlab = "x", ylab = "y", asp = 1)
points(st_coordinates(tracks_gps)[,"X"], st_coordinates(tracks_gps)[,"Y"], pch = 20)
Looks like it works. Let’s try fitting an HMM to this. First, let’s
prepare the data using prepData and plot the data to
estimate starting parameters.
Note how you can see the sections where we linearly interpolated
locations as a horizontal segment in step-length
# prepare data for the HMM (computes step lengths and turning angles)
prep_gps_linear <- prepData(tracks_gps_linear, type = "UTM")
# define distribution to use for each data stream
dist <- list(step = "gamma", angle = "vm")
# Setting up the starting values
mu0 <- c(50, 500) # Mean step length
sigma0 <- mu0 # Sd of the step length
kappa0 <- c(0.1, 1) # Turning angle concentration parameter (kappa > 0)
# combine starting parameters
Par0 <- list(step = c(mu0, sigma0), angle = kappa0)
Ok, we are ready. Let’s fit the HMM.
set.seed(1)
# Fit a 2 state HMM
HMM_gps_linear <- fitHMM(prep_gps_linear, nbState = 2, dist = dist, Par0 = Par0)
plot(HMM_gps_linear, ask = FALSE)
## Decoding state sequence... DONE
## Computing pseudo-residuals...
The plotted states look relatively ok; however, there is quite a weird flat section in the QQ-plot of the turning angle. In data with large linearly-interpolated gaps, the model often represents the original dynamic data as one state, and defines the straight interpolated segments with a second state. This is a common problem when data gaps are frequent or large, such that the information in the interpolated data outweighs the signal from the original observations. Generally, I would only use linear interpolation when data gaps are small (< 3 locations) or relatively infrequent (< 20 % of the modelled locations). In our data, some gaps are > 5 hours (30 locations) and > 50% of the modelled locations are interpolated, so we need to use another approach.
The flat section in the QQ-plot around 0 emerges because the model overestimates the values just below the middle quantile (in our case, \(0^\circ\)) and just above the middle quantile. In other words, our model is not capturing the large number of turning angles really close to \(0^\circ\) (this may be considered a good thing in our case, since we know these are linearly interpolated).
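To look at this more directly, the pseudo-residuals behind the QQ-plots can be extracted with momentuHMM’s pseudoRes function (a sketch; it assumes the fitted model object from above):

```r
# extract pseudo-residuals from the fitted HMM and re-draw the
# turning-angle QQ-plot; the pile-up of angles near 0 shows up as
# the flat section around the middle quantiles
pr <- pseudoRes(HMM_gps_linear)
qqnorm(pr$angleRes)
qqline(pr$angleRes)
```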
A slightly better way to interpolate locations is to fit a
continuous-time correlated random walk (CTCRW) model to the data and
predict the most likely locations. momentuHMM contains
wrapper functions to interpolate missing locations by fitting a CTCRW to
the data based on the crawl package by Devin Johnson and
Josh London. There are many options to fit the CTCRW model, and a
detailed tutorial for analysis with crawl is available
here: https://jmlondon.github.io/crawl-workshop/crawl-practical.html.
Let’s try to fit the most basic model using the wrapper function
crawlWrap.
set.seed(5) # crawl can fail to fit periodically, so I recommend always setting a seed
# fit crawl
crw_gps_10 <- crawlWrap(obsData = tracks_gps, timeStep = "10 mins")
## Fitting 3 track(s) using crawl::crwMLE...
## DONE
##
## Predicting locations (and uncertainty) at 10 mins time steps for 3 track(s) using crawl::crwPredict... DONE
# verify that all parameters were properly estimated
crw_gps_10$crwFits %>%
lapply(function(x){
x[c("par","se","ci")] %>% # get estimated values
unlist() %>% # unlist
is.nan() %>% # identify values that failed to estimate
sum() == 0 # count failed estimates and ensure there are 0
})
## $T172062
## [1] TRUE
##
## $T172064
## [1] TRUE
##
## $T172066
## [1] TRUE
Notice how the predicted tracks do not make perfectly straight lines through missing sections (particularly noticeable in T172062). Next, we will extract the predicted locations and add them to the observed data.
# filter predicted locations
tracks_gps_crw <- data.frame(crw_gps_10$crwPredict) %>%
filter(locType == "p") %>%
dplyr::select(mu.x, mu.y, time,
ID) %>%
dplyr::rename(x = "mu.x", y = "mu.y")
Now, let’s try to fit the same HMM as before on this data using the same starting parameters.
set.seed(1)
# prep data
prep_gps_crw <- prepData(tracks_gps_crw, type = "UTM")
# Setting up the starting values
mu0 <- c(50, 500) # Mean step length
sigma0 <- c(50, 500) # Sd of the step length
kappa0 <- c(0.1, 1) # Turning angle concentration parameter (kappa > 0)
Par0 <- list(step = c(mu0, sigma0), angle = kappa0)
# Fit a 2 state HMM
HMM_gps_crw <- fitHMM(prep_gps_crw, nbState = 2, dist = dist, Par0 = Par0)
plot(HMM_gps_crw, ask = FALSE)
## Decoding state sequence... DONE
That’s looking much better. It looks like state 1 represents a low-speed, high-tortuosity resident state, while state 2 represents a higher-speed, low-tortuosity travelling state.
In many instances, this model may be sufficient. However, the significant proportion of interpolated locations used to fit the model is likely to affect our results. For example, the large interpolated gaps still appear relatively straight with a very consistent speed, and may skew the definition of state 2 in particular. Simply interpreting the results with this issue in mind can be adequate when we have only a few interpolated locations. Below, we will explore more formal ways to address this problem, which can be particularly useful when we have very large numbers of interpolated locations.
One strategy to address large data gaps is to void the data streams
(i.e., step length and turning angle) during moderate/large interpolated
gaps, where we expect that the estimated movement is largely
independent of the observed data before or after the gap. The maximum
size of gap to allow depends on the frequency of the missing data,
frequency of locations, study species, and behaviours of interest. In
this case, I will void the data streams for locations predicted in
gaps \(>60\) min. First, we will
identify which steps need to be voided, then we will prepare the data
and void the estimated step and angle data
streams. We will do this again later in the tutorial, so we will wrap it
into a function called prepData_NAGaps.
# define function to replace step and turn angle of large gaps with NA
NA_gaps <- function(tracks, times){
# rows where location is within a large gap
rows <- which(
rowSums(apply(times, 1, function(X, tracks){
dplyr::between(tracks,
as.POSIXct(X[1], tz = "UTC"),
as.POSIXct(X[2], tz = "UTC"))
}, tracks$time))==1)
tracks$step[rows] <- NA
tracks$angle[rows] <- NA
return(tracks)
}
# define function to identify and nullify gaps
prepData_NAGaps <- function(track_list, tracks_crw, res, max_gap, ...){
# for each ID, identify which rows have gaps >= max_gap
gaps_ls_rows <- lapply(track_list, function(x){
which(difftime(lead(x$time), x$time, units = "mins") >= max_gap)
})
# create sequence of times for each track from gaps >= max_gap
gap_ls <- mapply(FUN = function(track, gaps){
# identify start and end date of each gap
gaps_ls_srt_end <- list(start = ceiling_date(track$time[gaps], paste(res, "min")),
end = floor_date(track$time[gaps+1], paste(res, "min")))
# combine into single vector for each track
data.frame(start = gaps_ls_srt_end$start, end = gaps_ls_srt_end$end)
},
track_list, gaps_ls_rows, SIMPLIFY = F)
# prep data and list by ID
prep_tracks <- prepData(tracks_crw, ...) %>%
{split(., .$ID)}
# Interpolate the location at the times from the sequence
mapply(FUN = NA_gaps, prep_tracks, gap_ls,
SIMPLIFY = F) %>%
do.call("rbind", .) # convert list of tracks back to a single data.frame
}
prep_tracks_gps_crw_NAgaps <- prepData_NAGaps(tracks_gps_ls, tracks_gps_crw, 10, 60, type = "UTM")
Now, let’s try to fit the same HMM as above to this data with large gaps voided.
set.seed(1)
# Setting up the starting values
mu0 <- c(50, 500) # Mean step length
sigma0 <- c(50, 500) # Sd of the step length
kappa0 <- c(0.1, 1) # Turning angle concentration parameter (kappa > 0)
Par0 <- list(step = c(mu0, sigma0), angle = kappa0)
# Fit a 2 state HMM
HMM_gps_crw_NAgaps <- fitHMM(prep_tracks_gps_crw_NAgaps, nbState = 2, dist = dist, Par0 = Par0)
plot(HMM_gps_crw_NAgaps, ask = FALSE)
## Decoding state sequence... DONE
Visually, the difference is subtle.
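The parameter estimates shown below can be pulled straight from the fitted model objects; in momentuHMM, the mle element stores the maximum-likelihood estimates for each data stream:

```r
# retrieve maximum-likelihood estimates of the state-dependent parameters
HMM_gps_crw$mle[c("step", "angle")]        # model with gaps interpolated
HMM_gps_crw_NAgaps$mle[c("step", "angle")] # model with large gaps voided
```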
## $step
## state 1 state 2
## mean 258.0970 705.1489
## sd 158.4498 255.9402
##
## $angle
## state 1 state 2
## mean 0.000000 0.00000
## concentration 2.703572 23.61065
## $step
## state 1 state 2
## mean 282.3696 757.5687
## sd 156.4446 248.9912
##
## $angle
## state 1 state 2
## mean 0.000000 0.000
## concentration 2.473555 19.035
However, the estimated parameters are quite different depending on whether you account for the large gaps in the data. When you void large gaps, the mean step length for both states is higher, and the turning angle concentration parameter is lower for both states (i.e., more tortuous). The fact that the parameters change for both states suggests that the large gaps skewed the parameterisation of both states.
Another strategy to deal with larger gaps is to segment the tracks with a new individual ID. This may be particularly appropriate for gaps where we may reasonably expect that the underlying states are effectively independent of one another. Specifically, we may ask: over what period of time does the behaviour of the animal affect its subsequent behaviour? In this case, we may expect that the behaviour of a narwhal depends on its behaviour over the preceding several hours, but is effectively independent after 24 hours. We can split the tracks at gaps larger than a predetermined threshold by iterating the ID column. We will not implement this approach in this tutorial; however, it can be done using the following code:
gap_thresh <- 3*60 # in minutes (3 h for illustration)
# start a new segment at gaps larger than gap_thresh
new_ID <- (tracks_gps$dt > gap_thresh | tracks_gps$ID != lag(tracks_gps$ID)) %>% # if dif.time > gap_thresh or new ID
replace_na(TRUE) %>% # replace first NA with ID = 1
cumsum() %>% # iterate ID
paste(tracks_gps$ID, ., sep = "_")
# then smaller gaps <= gap_thresh can be interpolated with crawlWrap
The last approach to estimate missing locations and regularise the data is multiple imputation (Hooten et al. 2017; McClintock 2017). Multiple imputation works by first fitting a CTCRW model to the original data; second, drawing (i.e., simulating) a number of realisations of the position process based on the CTCRW model; third, fitting HMMs to each of the simulated realisations; and finally, pooling the estimated parameters.
momentuHMM has several functions to implement multiple
imputation. The function MIfitHMM can be used both to
simulate realisations of a fitted CTCRW and to fit HMMs to each one. The
number of simulations is specified with nSims. We can
simulate realisations without fitting HMMs by setting
fit = FALSE. Here, let’s use the CTCRW model that we fit in
section @ref(stat_int) to simulate 4 tracks using MIfitHMM
and plot them over the original track.
set.seed(1)
# simulate 4 realisations of the 10 min GPS CTCRW model
MI_sim_gps <- MIfitHMM(crw_gps_10, nSims = 4, fit = FALSE)
## Drawing 4 realizations from the position process using crawl...
## DONE
## DONE
# plot locations for first narwhal
# filter first ID from original data
track <- tracks_gps %>%
mutate(x = st_coordinates(tracks_gps)[,"X"],
y = st_coordinates(tracks_gps)[,"Y"]) %>%
filter(ID == "T172062")
# filter first ID for each simulation
sim_tracks <- lapply(MI_sim_gps$miData, function(x){
filter(x, ID == "T172062")})
# plot original track for first narwhal
plot(track$x, track$y, col = "red", xlab = "x", ylab = "y", asp = 1, type = 'l')
# plot each simulated track
mute <- mapply(function(data, col){
points(y~x, data = data, col = col, type = 'l')
}, data = sim_tracks, col = list("cadetblue1", "cadetblue2", "cadetblue3", "cadetblue4"), SIMPLIFY = FALSE)
Notice how in some areas the different simulations have generally good
agreement in the likely location during gaps, while in others they
diverge. Multiple imputation can be particularly powerful if we want to
incorporate environmental variables, as spatially explicit variables can
be extracted for each simulated track to sample the most likely
conditions encountered by the animal.
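For instance, a spatial covariate could be sampled along each imputed track with terra. This is a sketch: depth_raster is a hypothetical SpatRaster in the same projection as the tracks, not part of this dataset:

```r
# extract a hypothetical covariate (e.g., bathymetry) at the locations of
# each imputed track; MI_sim_gps$miData is the list of imputed tracks
sim_covs <- lapply(MI_sim_gps$miData, function(d){
  d$depth <- terra::extract(depth_raster, cbind(d$x, d$y))[, 1]
  d
})
```

Pooling HMMs fit with such per-imputation covariates then propagates the location uncertainty into the covariate effects.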
As you can imagine, fitting multiple imputations can take quite a bit
of time. We can speed up the computation by simulating/fitting each
track in parallel using the ncores argument. We can
identify the number of cores our PC has using
parallel::detectCores().
# count number of available cores and use 60% of available cores
ncores <- parallel::detectCores()
ncores <- round(ncores*0.6)
Next, let’s simulate 12 realisations of the CTCRW and fit HMMs to each in parallel.
set.seed(1)
# fit multiple imputation HMM to the 10 min GPS crawl model
HMM_gps_MI <- MIfitHMM(crw_gps_10, nSims = 12, ncores = ncores,
nbStates = 2, dist = dist, Par0 = Par0, fit = TRUE)
## Drawing 12 realizations from the position process using crawl...
## Running simulator in parallel...
## [ ==================================================================================================== ] 100%
##
## Drawing imputations in parallel...
## [ ==================================================================================================== ] 100%
## Fitting 12 realizations of the position process using fitHMM...
## =======================================================================
## Fitting HMM with 2 states and 2 data streams
## -----------------------------------------------------------------------
##
## step ~ gamma(mean=~1, sd=~1)
## angle ~ vm(concentration=~1)
##
## Transition probability matrix formula: ~1
##
## Initial distribution formula: ~1
## =======================================================================
## Fitting 12 imputations in parallel...
## [ ==================================================================================================== ] 100%
## Decoding state sequences for each imputation...
## [ ==================================================================================================== ] 100%
## Decoding state probabilities for each imputation...
## [ ==================================================================================================== ] 100%
## Calculating location 95% error ellipses... DONE
When plotting multiple imputed data, momentuHMM will include estimated location error ellipses in the plots.
To better illustrate the effectiveness of multiple imputation in recovering states, we can randomly thin the locations and then compare state recovery from a single most likely imputed track to states from multiple imputed tracks. First, let’s randomly sample 98 locations for each narwhal (in addition to the first and last locations, so that the path length stays the same).
set.seed(1)
# number of rows for each narwhal
tracks_gps_sub <- tracks_gps_ls %>%
lapply(function(x){
# number of rows for each narwhal
nrow <- nrow(x)
# rows to sample sorted chronologically
sub <- sort(c(1, nrow, # first, last,
sample(2:(nrow-1), 98))) # random 98 from the middle
# sample selection
x[sub,]
}) %>%
# convert back to a single data frame
do.call(rbind, .)
Next, let’s fit a CTCRW to the sampled data (100 locations per narwhal).
# fit crawl to the subsampled data
crw_gps_sub <- crawlWrap(obsData = tracks_gps_sub, timeStep = "10 mins")
## Fitting 3 track(s) using crawl::crwMLE...
## DONE
##
## Predicting locations (and uncertainty) at 10 mins time steps for 3 track(s) using crawl::crwPredict... DONE
# check that all parameters were estimated
crw_gps_sub$crwFits %>%
lapply(function(x){
x[c("par","se","ci")] %>% # get estimated values
unlist() %>% # unlist
is.nan() %>% # identify values that failed to estimate
sum() == 0 # count failed estimates and ensure there are 0
})
## $T172062
## [1] TRUE
##
## $T172064
## [1] TRUE
##
## $T172066
## [1] TRUE
Next, we can predict the most likely track using
prepData on the CTCRW model, fit an HMM to this data using
fitHMM, and calculate the proportion of decoded states that
match the states predicted from the full data set.
# prep data
prep_gps_sub <- prepData(data = crw_gps_sub)
# Fit a 2 state HMM
set.seed(1)
HMM_gps_sub <- fitHMM(prep_gps_sub, nbState = 2, dist = dist, Par0 = Par0)
## =======================================================================
## Fitting HMM with 2 states and 2 data streams
## -----------------------------------------------------------------------
## step ~ gamma(mean=~1, sd=~1)
## angle ~ vm(concentration=~1)
##
## Transition probability matrix formula: ~1
##
## Initial distribution formula: ~1
## =======================================================================
## DONE
## [1] 0.4924709
In this case, it appears that there is only a \(49\%\) overlap between the states decoded from the single imputation based on the 300 sampled data points and those from the HMM based on the 1492 data points in the original data. Given that there are only two states, \(49\%\) is about what we would expect the model to achieve by random chance alone.
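For reference, an overlap like the one reported above can be computed by comparing the Viterbi-decoded states of the two models (a sketch, assuming both models were fit to locations predicted on the same 10 min time steps, with HMM_gps_crw standing in for the full-data model):

```r
# decode the most likely state sequences and compare them
states_full <- viterbi(HMM_gps_crw)  # HMM fit to the full data
states_sub  <- viterbi(HMM_gps_sub)  # HMM fit to the subsampled data
mean(states_sub == states_full)      # proportion of matching states
```

Note that state labels are arbitrary between fits, so in practice you should check whether states 1 and 2 are swapped before comparing.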
As a comparison, let’s generate and fit 100 simulated tracks from the CTCRW fit to the 300 sampled locations, then compare the states. With 100 simulations, the benefits of parallel processing become very apparent.
set.seed(1)
# fit MI HMM
HMM_gps_sub_MI <- MIfitHMM(crw_gps_sub, nSims = 100, ncores = ncores,
nbStates = 2, dist = dist, Par0 = Par0, fit = TRUE)
## Drawing 100 realizations from the position process using crawl...
## Running simulator in parallel...
## [ ==================================================================================================== ] 100%
##
## Drawing imputations in parallel...
## [ =--------------------------------------------------------------------------------------------------- ] 1%
[ ==-------------------------------------------------------------------------------------------------- ] 2%
[ ===------------------------------------------------------------------------------------------------- ] 3%
[ ====------------------------------------------------------------------------------------------------ ] 4%
[ =====----------------------------------------------------------------------------------------------- ] 5%
[ ======---------------------------------------------------------------------------------------------- ] 6%
[ =======--------------------------------------------------------------------------------------------- ] 7%
[ ========-------------------------------------------------------------------------------------------- ] 8%
[ =========------------------------------------------------------------------------------------------- ] 9%
[ ==========------------------------------------------------------------------------------------------ ] 10%
[ ===========----------------------------------------------------------------------------------------- ] 11%
[ ============---------------------------------------------------------------------------------------- ] 12%
[ =============--------------------------------------------------------------------------------------- ] 13%
[ ==============-------------------------------------------------------------------------------------- ] 14%
[ ===============------------------------------------------------------------------------------------- ] 15%
[ ================------------------------------------------------------------------------------------ ] 16%
[ =================----------------------------------------------------------------------------------- ] 17%
[ ==================---------------------------------------------------------------------------------- ] 18%
[ ===================--------------------------------------------------------------------------------- ] 19%
[ ====================-------------------------------------------------------------------------------- ] 20%
[ =====================------------------------------------------------------------------------------- ] 21%
[ ======================------------------------------------------------------------------------------ ] 22%
[ =======================----------------------------------------------------------------------------- ] 23%
[ ========================---------------------------------------------------------------------------- ] 24%
[ =========================--------------------------------------------------------------------------- ] 25%
[ ==========================-------------------------------------------------------------------------- ] 26%
[ ===========================------------------------------------------------------------------------- ] 27%
[ ============================------------------------------------------------------------------------ ] 28%
[ =============================----------------------------------------------------------------------- ] 29%
[ ==============================---------------------------------------------------------------------- ] 30%
[ ===============================--------------------------------------------------------------------- ] 31%
[ ================================-------------------------------------------------------------------- ] 32%
[ =================================------------------------------------------------------------------- ] 33%
[ ==================================------------------------------------------------------------------ ] 34%
[ ===================================----------------------------------------------------------------- ] 35%
[ ====================================---------------------------------------------------------------- ] 36%
[ =====================================--------------------------------------------------------------- ] 37%
[ ======================================-------------------------------------------------------------- ] 38%
[ =======================================------------------------------------------------------------- ] 39%
[ ========================================------------------------------------------------------------ ] 40%
[ =========================================----------------------------------------------------------- ] 41%
[ ==========================================---------------------------------------------------------- ] 42%
[ ===========================================--------------------------------------------------------- ] 43%
[ ============================================-------------------------------------------------------- ] 44%
[ =============================================------------------------------------------------------- ] 45%
[ ==============================================------------------------------------------------------ ] 46%
[ ===============================================----------------------------------------------------- ] 47%
[ ================================================---------------------------------------------------- ] 48%
[ =================================================--------------------------------------------------- ] 49%
[ ==================================================-------------------------------------------------- ] 50%
[ ===================================================------------------------------------------------- ] 51%
[ ====================================================------------------------------------------------ ] 52%
[ =====================================================----------------------------------------------- ] 53%
[ ======================================================---------------------------------------------- ] 54%
[ =======================================================--------------------------------------------- ] 55%
[ ========================================================-------------------------------------------- ] 56%
[ =========================================================------------------------------------------- ] 57%
[ ==========================================================------------------------------------------ ] 58%
[ ===========================================================----------------------------------------- ] 59%
[ ============================================================---------------------------------------- ] 60%
[ =============================================================--------------------------------------- ] 61%
[ ==============================================================-------------------------------------- ] 62%
[ ===============================================================------------------------------------- ] 63%
[ ================================================================------------------------------------ ] 64%
[ =================================================================----------------------------------- ] 65%
[ ==================================================================---------------------------------- ] 66%
[ ===================================================================--------------------------------- ] 67%
[ ====================================================================-------------------------------- ] 68%
[ =====================================================================------------------------------- ] 69%
[ ======================================================================------------------------------ ] 70%
[ =======================================================================----------------------------- ] 71%
[ ========================================================================---------------------------- ] 72%
[ =========================================================================--------------------------- ] 73%
[ ==========================================================================-------------------------- ] 74%
[ ===========================================================================------------------------- ] 75%
[ ============================================================================------------------------ ] 76%
[ =============================================================================----------------------- ] 77%
[ ==============================================================================---------------------- ] 78%
[ ===============================================================================--------------------- ] 79%
[ ================================================================================-------------------- ] 80%
[ =================================================================================------------------- ] 81%
[ ==================================================================================------------------ ] 82%
[ ===================================================================================----------------- ] 83%
[ ====================================================================================---------------- ] 84%
[ =====================================================================================--------------- ] 85%
[ ======================================================================================-------------- ] 86%
[ =======================================================================================------------- ] 87%
[ ========================================================================================------------ ] 88%
[ =========================================================================================----------- ] 89%
[ ==========================================================================================---------- ] 90%
[ ===========================================================================================--------- ] 91%
[ ============================================================================================-------- ] 92%
[ =============================================================================================------- ] 93%
[ ==============================================================================================------ ] 94%
[ ===============================================================================================----- ] 95%
[ ================================================================================================---- ] 96%
[ =================================================================================================--- ] 97%
[ ==================================================================================================-- ] 98%
[ ===================================================================================================- ] 99%
[ ==================================================================================================== ] 100%
## Fitting 100 realizations of the position process using fitHMM...
## =======================================================================
## Fitting HMM with 2 states and 2 data streams
## -----------------------------------------------------------------------
##
## step ~ gamma(mean=~1, sd=~1)
## angle ~ vm(concentration=~1)
##
## Transition probability matrix formula: ~1
##
## Initial distribution formula: ~1
## =======================================================================
## Fitting 100 imputations in parallel...
## [ ==================================================================================================== ] 100%
## Decoding state sequences for each imputation...
## [ ==================================================================================================== ] 100%
## Decoding state probabilities for each imputation...
## Calculating location 95% error ellipses... DONE
## Decoding state sequences for each imputation...
## Decoding state probabilities for each imputation...
## Calculating location 95% error ellipses... DONE
## [1] 0.7395619
Using multiple imputation, we see that the state overlap increases from \(49\%\) to \(74\%\). On your own time, try increasing the number of simulations even more to see how the predictions change.
One issue that may come up when implementing multiple imputation is
“label switching”. Label switching occurs when two or more HMMs converge
to effectively the same solution (i.e., all parameter estimates are the
same), but assign different state labels to the same sets of parameters.
For example, one HMM defines state 1 as slow and state 2 as fast, while
a second HMM defines state 1 as fast and state 2 as slow. This can occur
if the states have similar parameters, if the initial starting
parameters are far from the optimum, or if the likelihood surface is
noisy or complex. The key issue is that, when pooling the results from
multiple imputations, label switching would make the summarized
parameters or estimated states incorrect. The best strategy for
preventing state label switching is to use pseudo-design matrices
(pseudo-DMs) in tandem with the workBounds argument when repeatedly
fitting the same HMM. Pseudo-DMs operate by mapping working parameters
(columns) to natural parameters (rows). Each distribution parameter
(e.g., step length mean and sd) for each state must have precisely one
corresponding row in the DM. For example, a two-state HMM with one data
stream (e.g., step length) that is defined by two parameters (e.g., mean
and sd) must have 4 rows. If the parameters for each state are
completely independent, the pseudo-DM would look as follows:
stepDM <- matrix(c(1,0,0,0,
0,1,0,0,
0,0,1,0,
0,0,0,1),
nrow = 4, ncol = 4,
byrow = TRUE)
rownames(stepDM) <- c("mu_1","mu_2","sd_1","sd_2")
colnames(stepDM) <- c("mu_1_i","mu_2_i", "sd_1_i","sd_2_i")
stepDM
##      mu_1_i mu_2_i sd_1_i sd_2_i
## mu_1 1 0 0 0
## mu_2 0 1 0 0
## sd_1 0 0 1 0
## sd_2 0 0 0 1
Note: naming the rows and columns is completely optional. In addition, when DMs are used, the starting parameters must be specified on the working scale \((-\infty, \infty)\). If, for example, we want both states to share the same mean parameter, the DM would look as follows:
stepDM <- matrix(c(1,0,0,
1,0,0,
0,1,0,
0,0,1),
nrow = 4, ncol = 3,
byrow = TRUE)
rownames(stepDM) <- c("mu_1","mu_2","sd_1","sd_2")
colnames(stepDM) <- c("mu_1.2_i", "sd_1_i","sd_2_i")
stepDM
##      mu_1.2_i sd_1_i sd_2_i
## mu_1 1 0 0
## mu_2 1 0 0
## sd_1 0 1 0
## sd_2 0 0 1
In this case, the model would estimate 3 parameters:
mu_1.2_i, sd_1_i, and sd_2_i,
where mu_1.2_i would represent the mean parameter for both
state 1 and state 2.
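To make the mapping concrete, here is a small illustrative sketch (the working-parameter values are made up, not fitted estimates): multiplying the shared-mean pseudo-DM by a working parameter vector yields the log-scale natural parameters, so states that share a column share a parameter.

```r
# Illustrative only: a pseudo-DM maps working parameters (columns) onto
# log-scale natural parameters (rows); a shared column means a shared value.
stepDM <- matrix(c(1, 0, 0,
                   1, 0, 0,
                   0, 1, 0,
                   0, 0, 1),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(c("mu_1", "mu_2", "sd_1", "sd_2"),
                                 c("mu_1.2_i", "sd_1_i", "sd_2_i")))
beta <- c(log(500), log(100), log(800))  # hypothetical working parameters
exp(stepDM %*% beta)                     # both states get mu = 500 m
```

Because both mu rows point at the same column, any value of `mu_1.2_i` is applied to state 1 and state 2 alike.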
Notably, we can constrain the natural parameters of one state to be
greater or less than another state by using DMs in tandem with
workBounds, which allows us to specify constraints on
working parameters. For example, if we want state 2 to be faster than
state 1, we can define the pseudo-DM and workBounds as
follows:
# step DM
stepDM <- matrix(c(1,0,0,0,
1,1,0,0, # Notice the two 1s
0,0,1,0,
0,0,0,1),
nrow = 4, ncol = 4,
byrow = TRUE)
rownames(stepDM) <- c("mu_1","mu_2","sd_1","sd_2")
colnames(stepDM) <- c("b1","b2", "b3","b4")
# work bound
stepWB <- matrix(c(-Inf, 0, -Inf, -Inf, # lower bounds
Inf, Inf, Inf, Inf), # upper bounds
nrow = ncol(stepDM), ncol = 2,
dimnames = list(colnames(stepDM),
c("lower","upper")))
# view
stepDM; stepWB
##      b1 b2 b3 b4
## mu_1 1 0 0 0
## mu_2 1 1 0 0
## sd_1 0 0 1 0
## sd_2 0 0 0 1
## lower upper
## b1 -Inf Inf
## b2 0 Inf
## b3 -Inf Inf
## b4 -Inf Inf
In this case, \(\beta_1\) would
represent the mean step length parameter for state 1, and the mean step
length for state 2 would be represented by the sum of \(\beta_1\) and \(\beta_2\), where \(\beta_2\) is transformed to be \(>0\).
Specifically, \(\mu_1^{(l)} =
\exp(\beta_1)\) and \(\mu_2^{(l)} =
\exp(\beta_1 + \exp(\beta_2))\). Note: without specifying
workBounds, \(\beta_2\)
would not be transformed, and state 2 could be slower than state 1
(i.e., if \(\beta_2<0\), then \(\exp(\beta_1) > \exp(\beta_1 +
\beta_2)\)). For completeness, the following code integrates the
pseudo-DM and working constraints in an HMM fit multiple times using a
multiple-imputation HMM (we will not run this).
# create named lists for DM and workBounds
DM <- list(step = stepDM)
WB <- list(step = stepWB)
# convert starting parameters for step length to the working scale
Par0_WB <- list(step = log(Par0$step),
angle = Par0$angle)
Par0_WB$step[2] <- 0
# fit MI HMM
HMM_gps_sub_MI_WB <- MIfitHMM(crw_gps_sub, nSims = 10, ncores = ncores,
nbStates = 2, dist = dist, fit = T,
Par0 = Par0_WB, DM = DM, workBounds = WB)
Note: because we specify a DM for step length, we must log-transform the starting parameters to the working scale.
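The effect of the constraint can be checked numerically. This is an illustrative sketch with made-up working-scale values, not tutorial output: because \(\beta_2\) enters through \(\exp()\), its contribution is always positive, so state 2's mean step length always exceeds state 1's.

```r
# Illustrative check of the workBounds constraint: exp(b2) > 0 for any
# real b2, so mu2 = exp(b1 + exp(b2)) always exceeds mu1 = exp(b1).
b1 <- log(200)            # hypothetical working-scale intercept
b2 <- c(-3, 0, 2)         # any real values allowed on the working scale
mu1 <- exp(b1)
mu2 <- exp(b1 + exp(b2))
all(mu2 > mu1)            # TRUE for every choice of b2
```

Without the bound, a negative \(\beta_2\) would instead shrink state 2's mean below state 1's, which is exactly the label-switching risk described above.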
Let’s import the dive data and explore it.
dives <- read.csv("data/dives.csv") %>%
mutate(time = ymd_hms(time)) %>%
filter(month(time) == 8, day(time) > 7, day(time) <= 14)
head(dives)
##        ID                time depth   dt
## 1 T172062 2017-08-08 00:00:00 1.0 1.25
## 2 T172062 2017-08-08 00:01:15 7.5 1.25
## 3 T172062 2017-08-08 00:02:30 11.0 1.25
## 4 T172062 2017-08-08 00:03:45 5.5 1.25
## 5 T172062 2017-08-08 00:05:00 3.5 1.25
## 6 T172062 2017-08-08 00:06:15 1.0 1.25
##
## 1.25 61.25 121.25 181.25 241.25 301.25 361.25
## 18547 52 16 6 1 1 1
It appears as though there are relatively few gaps; however, the gaps that do exist are relatively long (> 60 min). Therefore, we should regularize the dive data and avoid large gaps.
# regularise data
dives <- dives %>%
group_by(ID) %>%
# regularise time series by 1.25 min
summarise(time = seq(first(time), last(time), by = 1.25*60)) %>%
# merge regularised time with original dive
left_join(dives, by = c("ID", "time"))
## `summarise()` has grouped output by 'ID'. You can override using the `.groups`
## argument.
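The regularisation idea can be seen on a toy example (the times and depths here are made up, not from the narwhal data): build the full 75 s grid between the first and last records, then join the observed depths onto it, so that time gaps become explicit NA rows.

```r
# Toy sketch of regularisation (made-up values): the complete time grid
# exposes gaps in the record as NA depths after the join.
obs <- data.frame(
  time  = as.POSIXct("2017-08-08 00:00:00", tz = "UTC") + c(0, 75, 300),
  depth = c(1, 7.5, 3.5))                    # 150 s gap before the last row
grid <- data.frame(time = seq(min(obs$time), max(obs$time), by = 75))
merge(grid, obs, by = "time", all.x = TRUE)  # NA depth for the missing steps
```

This uses base merge() for self-containment; the pipeline above does the same thing per whale with left_join().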
Next, we have to summarise the time series into concrete data streams
that can be modelled in the HMM. We can use the diveMove
package to identify individual dives and calculate dive statistics. We
must first convert the depth data to the TDR class for each
whale. As we are working with multiple whales, we will use
lapply to apply createTDR to each whale –
we will also use lapply for most of the diveMove
functions.
dives_ls <- split(dives, dives$ID)
dive_TDR <- lapply(dives_ls, function(data){
createTDR(time = data$time, depth = data$depth, dtime = 1.25*60, file = "data/dives.csv")
})
# generate interactive plot
plotTDR(dive_TDR[[1]]) # note: try zooming in to part of the dive
Next, we must use calibrateDepth to calibrate the
depth data. This function identifies wet and dry periods (for animals
that haul out), applies a zero-offset correction (ZOC), and identifies
all dives in the record and their phases. ZOC can be done using either
the offset or filter method. In this case, we
will assume the depth data is accurate and does not require an offset.
We should also specify dive.thr, which represents the
threshold depth below which an underwater phase should be considered a
dive – in this case, we will set it at 8 m. There are many other
optional parameters to identify dives that we will not get into in this
tutorial (See vignette("diveMove") for details).
# calibrate TDR and identify dive phases
dive_TDR_calib <- lapply(X = dive_TDR, FUN = calibrateDepth, zoc.method = "offset", offset = 0, dive.thr = 8)
## Record is truncated at the beginning and at the end
## 39 phases detected
## 818 dives detected
## Record is truncated at the beginning and at the end
## 75 phases detected
## 484 dives detected
## Record is truncated at the beginning and at the end
## 41 phases detected
## 628 dives detected
In the interactive plot, try zooming into one of the dives and hovering the mouse over the plot to preview the phases identified by calibrateDepth. The phases identified correspond to descent (D), descent/bottom (DB), bottom (B), bottom/ascent (BA), ascent (A), and surface (X). We can plot a specific dive as follows:
Now, we can calculate summary statistics for each dive using the
function diveStats. There are many dive metrics that are
estimated, and which ones to retain is species-, data-, and
question-specific. In this case, we will retain five of those calculated
by diveStats: dive time, bottom time, maximum depth, bottom
distance (a measure of “wiggling” while at the bottom), and post-dive
duration.
# calculate dive stats and add dive.id to each dive
dive_sum <- lapply(dive_TDR_calib, function(data){
mutate(diveStats(data, depth.deriv = FALSE), dive.id = row_number()) %>%
dplyr::select(dive.id, divetim, botttim, maxdep, bottdist, postdive.dur)}) # select variables of interest
# add dive.id with depth data to each depth record
dives_ls <- mapply(function(TDR, dives){
mutate(TDR, dive.id = dives@dive.activity$dive.id)
}, TDR = dives_ls, dives = dive_TDR_calib, SIMPLIFY = F)
# join TDR data with dive summary data
dives_ls <- mapply(function(TDR, dive_sum){
left_join(TDR, dive_sum, by = "dive.id")
}, TDR = dives_ls, dive_sum = dive_sum, SIMPLIFY = F)
# convert dive_ls back to a data.frame
dives_df <- do.call(rbind, dives_ls)
Next, we will replace any missing bottom time of a valid dive (dive.id > 0) with 75 s, since at least some time must be spent at the bottom. Then, we will calculate one additional metric as a proxy for dive shape: the ratio of bottom time to dive time. Dives where \(\leq20\%\) of the time is spent at the bottom represent V-shaped dives, U-shaped dives are represented when \(>20\%\) and \(\leq50\%\) is spent at the bottom, and square dives are represented when \(>50\%\) of the time is spent at the bottom.
# replace NA bottom time of valid dives
dives_df$botttim[dives_df$dive.id > 0 & is.na(dives_df$botttim)] <- 75
# calculate proportion time at bottom
dives_df <- dives_df %>%
mutate(propbott = botttim/divetim)
# remove "dives" with no duration
dives_df <- dives_df %>%
filter(!(dive.id > 0 & is.na(divetim)))
The next issue is that the dive data are at a different temporal resolution (75 s) than the location data (10 min). There are two options to include both data streams in the same HMM. First, we could implement a hierarchical HMM; however, this is more complicated and will be covered in tomorrow’s tutorial. The second, which we will use here, is to summarise the depth/dive data to a 10 min resolution and include them as additional data streams alongside step length and turning angle.
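The dive-shape thresholds described above are not coded in the pipeline itself, but they can be sketched with base R's cut() (the propbott values here are hypothetical):

```r
# Illustrative only: classify dive shape from the proportion of dive
# time spent at the bottom, using the thresholds described in the text.
propbott <- c(0.10, 0.35, 0.80)              # hypothetical ratios
shape <- cut(propbott,
             breaks = c(0, 0.2, 0.5, 1),      # <=20%, 20-50%, >50%
             labels = c("V-shaped", "U-shaped", "square"),
             include.lowest = TRUE)
as.character(shape)   # "V-shaped" "U-shaped" "square"
```

cut() uses right-closed intervals by default, which matches the \(\leq20\%\) and \(\leq50\%\) boundaries above.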
dives_sum <- dives_df %>%
# round time to same interval as location data
mutate(time = floor_date(time, "10 min")) %>%
# group rows by time interval
group_by(ID, time) %>%
# summarise data
summarise(NA_t = sum(is.na(depth))*1.25,
surf_t = sum(dive.id == 0)*1.25,
mean_dep = ifelse(NA_t == 10, NA, mean(na.rm = T, depth)),
max_dep = ifelse(NA_t == 10, NA, max(na.rm = T, depth)),
dive_t = ifelse(NA_t == 10, NA, sum(na.rm = T, divetim)),
bott_t = ifelse(dive_t < 5, NA, sum(na.rm = T, botttim)),
prop_bott = ifelse(dive_t < 5, NA, max(na.rm = T, propbott)),
max_dive_dep = ifelse(dive_t < 5, NA, max(na.rm = T, maxdep)),
bott_dist = ifelse(dive_t < 5, NA, max(na.rm = T, bottdist)),
post_dive_dur = ifelse(dive_t < 5, NA, max(na.rm = T, postdive.dur))) %>%
# remove -Inf values (typically error)
filter(!is.infinite(dive_t) & !is.infinite(bott_dist))
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## Warning in max(na.rm = T, bottdist): no non-missing arguments to max; returning
## -Inf
## `summarise()` has grouped output by 'ID'. You can override using the `.groups`
## argument.
## # A tibble: 6 × 12
## # Groups: ID [1]
## ID time NA_t surf_t mean_dep max_dep dive_t bott_t
## <chr> <dttm> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 T172062 2017-08-08 00:10:00 0 3.75 17.3 31 1875 375
## 2 T172062 2017-08-08 00:20:00 0 6.25 7.56 20 1875 1125
## 3 T172062 2017-08-08 00:30:00 0 0 109. 123 7200 4200
## 4 T172062 2017-08-08 00:40:00 0 5 20.9 88 2100 1200
## 5 T172062 2017-08-08 00:50:00 0 5 9.94 24.5 1275 450
## 6 T172062 2017-08-08 01:00:00 0 1.25 53.2 76.5 4200 1575
## # … with 4 more variables: prop_bott <dbl>, max_dive_dep <dbl>,
## # bott_dist <dbl>, post_dive_dur <dbl>
Note that we used two different ifelse statements. First,
ifelse(NA_t == 10, ...) ensured that we only calculate mean
and maximum depth for periods where we have at least 1.25 min of depth
data. Second, ifelse(surf_t > 5, ...) ensured that we
only calculate dive metrics when at least half of the 10 min interval is
spent in a dive; otherwise, the animal is assumed to be at the surface
and its dive metrics are NA.
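For reference, here is a minimal sketch of how those two guards might sit inside a summarise() call. The depth column, sampling interval, and surface threshold are assumptions for illustration; only the two ifelse guards mirror the code described above.

```r
library(dplyr)

# hypothetical sketch: dives has columns ID, time (10 min bins), and depth (m)
dives_sum <- dives %>%
  group_by(ID, time) %>%
  summarise(
    NA_t   = sum(is.na(depth)) * 1.25,            # minutes of missing depth data
    surf_t = sum(depth < 2, na.rm = TRUE) * 1.25, # minutes spent near the surface
    # guard 1: skip intervals with no usable depth data
    mean_dep = ifelse(NA_t == 10, NA, mean(depth, na.rm = TRUE)),
    max_dep  = ifelse(NA_t == 10, NA, max(depth, na.rm = TRUE)),
    # guard 2: treat mostly-surface intervals as non-dives
    max_dive_dep = ifelse(surf_t > 5, NA, max(depth, na.rm = TRUE)),
    .groups = "drop")
```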
Any of the eight variables can be used as data streams in the HMM; however, including too many would significantly increase the number of parameters to estimate and, consequently, computation time. Therefore, we must select which variables to use, for which there are several approaches:
1. Variables can be selected using expert knowledge of the species' behaviour and the research question.
2. We preferably want to avoid variables with a lot of missing data.
3. The data streams should exhibit variation and evidence of clustering or multi-modality that may be tied to the underlying behaviours.
4. Data streams with no variation, or conversely that are very noisy, contain little information on underlying behaviour.
5. We want to avoid variables that are overdispersed and would be difficult to describe using a statistical distribution.
6. Given that they are tied to autocorrelated behaviours, they too should exhibit some – but not too much – autocorrelation.
7. One of the HMM's key assumptions is that the data streams are independent of each other (conditional on the state); therefore, we should avoid selecting highly collinear data streams.
Assuming all the variables are biologically relevant, let's look at
structure in the data. I will also add a subjective score based on the
results (the first number after the # symbol).
vars <- c("mean_dep","max_dep","dive_t","bott_t","prop_bott",
"max_dive_dep","bott_dist","post_dive_dur")
# First, missing data
dives_sum %>%
summarise(mean_dep = sum(is.na(mean_dep)), # 2
max_dep = sum(is.na(max_dep)), # 2
dive_t = sum(is.na(dive_t)), # 2
bott_t = sum(is.na(bott_t)), # 1
prop_bott = sum(is.na(prop_bott)), # 1
max_dive_dep = sum(is.na(max_dive_dep)), # 1
bott_dist = sum(is.na(bott_dist)), # 1
post_dive_dur = sum(is.na(post_dive_dur))) # 1
## # A tibble: 3 × 9
## ID mean_dep max_dep dive_t bott_t prop_bott max_dive_dep bott_dist
## <chr> <int> <int> <int> <int> <int> <int> <int>
## 1 T172062 120 120 120 216 216 216 216
## 2 T172064 366 366 366 476 476 476 476
## 3 T172066 210 210 210 359 359 359 359
## # … with 1 more variable: post_dive_dur <int>
# Second, evidence of multi-modality, balanced variation, and no over-dispersion
# as the dive variables are zero-bound, applying a log transformation makes it easier to see structure in the distribution
# dives_sum %>%
# ungroup() %>%
# dplyr::select(vars) %>%
# apply(2, hist, 100)
hist(log(dives_sum$mean_dep), 100) # 2 good structure, high dispersion
## Warning in logit(dives_sum$prop_bott): proportions remapped to (0.025, 0.975)
# Third, balanced autocorrelation
dives_sum_filter <- filter(dives_sum, ID == "T172062")
acf(dives_sum_filter$mean_dep, na.action = na.pass) # 1 ACF~8 some variability
# subjective score
data.frame(var = c("mean_dep", "max_dep", "dive_t", "bott_t", "prop_bott",
"max_dive_dep", "bott_dist", "post_dive_dur"),
score = c(5, 6, 5, 1,
1, 4, 4, 1))
##             var score
## 1 mean_dep 5
## 2 max_dep 6
## 3 dive_t 5
## 4 bott_t 1
## 5 prop_bott 1
## 6 max_dive_dep 4
## 7 bott_dist 4
## 8 post_dive_dur 1
From my subjective interpretation of these outputs, I think the five most promising variables to include are maximum depth (max_dep), dive time (dive_t), mean depth (mean_dep), maximum dive depth (max_dive_dep), and bottom distance (bott_dist). Next, let's merge the dive data streams with the location data.
tracks_dives <- left_join(prep_tracks_gps_crw_NAgaps,
dives_sum[,c("ID", "time", "max_dep", "dive_t",
"mean_dep", "max_dive_dep", "bott_dist")],
by = c("ID", "time"))
Now, we must check for collinearity between the data streams to select 1–3 to use in the HMM. We can check collinearity using Pearson's correlation matrix.
# calculate and plot Pearson's correlation matrix
tracks_dives[,c("max_dep", "dive_t", "mean_dep", "max_dive_dep",
"bott_dist", "step", "angle")] %>%
na.omit() %>%
cor() %>%
corrplot(method="number")
step and angle seem quite independent from
all variables, as does bott_dist. Unfortunately, there is
quite high correlation between the first four dive data streams, so we
will have to select which to use. dive_t and
mean_dep have the lowest correlation with
bott_dist and step; however,
max_dep had the highest subjective score. For the purposes
of this tutorial, I will use max_dep and
bott_dist.
To get starting parameters, we can fit an HMM to each one independently. We will use the gamma distribution for both and for now will use the same starting parameters.
# identify whether there is 0 data
tracks_dives %>%
summarise(max_dep = sum(max_dep == 0, na.rm = T),
bott_dist = sum(bott_dist == 0, na.rm = T))
##   max_dep bott_dist
## 1 0 627
# therefore, we need to include zero-mass parameters for bott_dist
# starting parameters (will use same ones for both for now)
mu0 <- c(130, 180) # mean
sigma0 <- c(60, 90) # sd
zm <- c(0.1, 0.1) # zero mass, where applicable
# fit dive-only HMMs
set.seed(1)
HMM_max_dep <- fitHMM(tracks_dives, nbStates = 2, dist = list(max_dep = "gamma"),
Par0 = list(max_dep = c(mu0, sigma0)))
HMM_bott_dist <- fitHMM(tracks_dives, nbStates = 2, dist = list(bott_dist = "gamma"),
Par0 = list(bott_dist = c(mu0, sigma0, zm)))
Next, we can integrate these into one HMM together with step length and turning angle. When we specify the starting parameters, we want to think about how the states may look. For example, we might expect one resident state with slower movement, lower angular concentration, deeper dives, and greater at-depth variability. The second state may be travel, with faster movement, greater angular concentration, shallower dives, and less at-depth variability.
# prep model
stateNames = c("resident", "travel")
nbStates = length(stateNames)
dist = list(step = "gamma", angle = "vm",
max_dep = "gamma", bott_dist = "gamma")
# Starting Pars
# view Pars from previous HMMs
getPar(HMM_gps_crw_NAgaps)$Par # state 1 ~ resident, state 2 ~ travel
## $step
## [1] 282.3696 757.5687 156.4446 248.9912
##
## $angle
## [1] 2.473555 19.035002
## $max_dep
## [1] 30.15548 407.63055 29.06443 163.53662
## $bott_dist
## [1] 24.2681154 317.3752361 25.1739909 203.2573833 0.2211351 0.5232927
# therefore must switch parameter estimates in HMM_max_dep & HMM_bott_dist
# combine starting Pars
step0 <- getPar(HMM_gps_crw_NAgaps)$Par$step
angle0 <- getPar(HMM_gps_crw_NAgaps)$Par$angle
max_dep0 <- c(getPar(HMM_max_dep)$Par$max_dep[c(2,1)], # mu1, mu2
getPar(HMM_max_dep)$Par$max_dep[c(4,3)]) # sd1, sd2
bott_dist0 <- c(getPar(HMM_bott_dist)$Par$bott_dist[c(2,1)], # mu1, mu2
getPar(HMM_bott_dist)$Par$bott_dist[c(4,3)], # sd1, sd2
getPar(HMM_bott_dist)$Par$bott_dist[c(6,5)]) # zm1, zm2
Par0 <- list(step = step0, angle = angle0, max_dep = max_dep0, bott_dist = bott_dist0)
Now, we can fit the HMM with movement and dive variables.
set.seed(1)
HMM_move_dive <- fitHMM(tracks_dives, nbStates=nbStates, stateNames=stateNames, dist=dist, Par0=Par0)
Let's see what it looks like.
## Decoding state sequence... DONE
## Computing pseudo-residuals...
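The plotting chunk itself is not shown above; a minimal sketch of the calls that would produce this output, using momentuHMM's plot and pseudo-residual methods, might look like:

```r
# decode the most likely state sequence and plot tracks and fitted distributions
plot(HMM_move_dive, ask = FALSE)
# plot pseudo-residuals (QQ plots and ACFs) to diagnose model fit
plotPR(HMM_move_dive)
```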
The tracks look interesting, but there are some issues in the pseudo-residuals. Looking at the QQ plots, the model appears to:
- overestimate the number of fast movement steps
- underestimate higher turning angles
- underestimate shallow dives and overestimate deep dives
- describe bott_dist quite poorly

The autocorrelation functions suggest there is residual autocorrelation in step and max_dep that is not well described by the model. Together, this information suggests that there may be additional states that are not represented. Let's try to add one more state with an intermediate speed, lower angular concentration, and shallow dives.
# define states
stateNames <- c("resident", "travel", "search")
nbStates <- length(stateNames)
# Starting Pars
# get Pars from last HMM_move_dive HMM
Pars <- getPar(HMM_move_dive)$Par
# combine starting Pars
step0 <- c(Pars$step[c(1,2)], mean(Pars$step[c(1,2)]), # mu
Pars$step[c(3,4)], mean(Pars$step[c(3,4)])) # sd
angle0 <- c(Pars$angle, 3)
max_dep0 <- c(Pars$max_dep[c(1,2)], 25, # mu
Pars$max_dep[c(3,4)], 10) # sd
bott_dist0 <- c(Pars$bott_dist[c(1,2)], mean(Pars$bott_dist[c(1,2)]), # mu
Pars$bott_dist[c(3,4)], mean(Pars$bott_dist[c(3,4)]), # sd
Pars$bott_dist[c(5,6)], mean(Pars$bott_dist[c(5,6)]))# zm
Par0 <- list(step = step0, angle = angle0, max_dep = max_dep0, bott_dist = bott_dist0)
# fit 3-state HMM
set.seed(1)
HMM_move_dive_3s <- fitHMM(tracks_dives, nbStates=nbStates, stateNames=stateNames, dist=dist, Par0=Par0)
## Computing pseudo-residuals...
## Decoding state sequence... DONE
Interestingly, the step and angle QQ-plots
were not improved much, though the step ACF was improved.
However, compared to the 2-state model, there was a drastic improvement
in the QQ-plot and ACF for the max_dep data stream and a marginal
improvement in the bott_dist data stream. The newly
described state appears to have very low step length, really wide
turning angles, minimal diving, and minimal at-depth activity, which may
actually be indicative of resting. The "resident" state has intermediate
speed and turning angle, but very deep dives and high at-depth activity,
suggesting it may represent foraging. Let's try to fit one more 4-state
HMM to try and address the remaining residuals: intermediate speed,
lower angular concentration, and more intermediate dives, which may
represent searching behaviour.
# define states
stateNames <- c("forage", "travel", "rest", "search")
nbStates <- length(stateNames)
# Starting Pars
# get Pars from last HMM_move_dive HMM
Pars <- getPar(HMM_move_dive_3s)$Par
# combine starting Pars
step0 <- c(Pars$step[c(1:3)], mean(Pars$step[3]), # mu
Pars$step[c(4:6)], mean(Pars$step[6])) # sd
angle0 <- c(Pars$angle, Pars$angle[3])
max_dep0 <- c(Pars$max_dep[c(1:2)], Pars$max_dep[3]/2, Pars$max_dep[3], # mu
Pars$max_dep[c(4:5)], Pars$max_dep[6]/2, Pars$max_dep[6]) # sd
bott_dist0 <- c(Pars$bott_dist[c(1:3)], Pars$bott_dist[3], # mu
Pars$bott_dist[c(4:6)], Pars$bott_dist[6], # sd
Pars$bott_dist[c(7:9)], Pars$bott_dist[9]) # zm
Par0 <- list(step = step0, angle = angle0, max_dep = max_dep0, bott_dist = bott_dist0)
# fit 4-state HMM
set.seed(1)
HMM_move_dive_4s <- fitHMM(tracks_dives, nbStates=nbStates, stateNames=stateNames, dist=dist, Par0=Par0)
## =======================================================================
## Fitting HMM with 4 states and 4 data streams
## -----------------------------------------------------------------------
## step ~ gamma(mean=~1, sd=~1)
## angle ~ vm(concentration=~1)
## max_dep ~ gamma(mean=~1, sd=~1)
## bott_dist ~ gamma(mean=~1, sd=~1, zeromass=~1)
##
## Transition probability matrix formula: ~1
##
## Initial distribution formula: ~1
## =======================================================================
## DONE
## Computing pseudo-residuals...
## Decoding state sequence... DONE
The QQ-plot for step is quite a bit improved, as is the one for
bott_dist. The ACF for step is also much
improved. At this point, I'm not sure the model can be significantly
improved with the existing data. Although HMMs with different numbers of
states generally shouldn't be compared using AIC, since AIC tends to favour more
states, it is a good sanity check.
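The comparison below was presumably produced by something like the following sketch; momentuHMM provides an AIC method for fitted model objects:

```r
# compare the 2-, 3-, and 4-state models by AIC (lower is better)
data.frame(Model = c("HMM_move_dive_4s", "HMM_move_dive_3s", "HMM_move_dive"),
           AIC = c(AIC(HMM_move_dive_4s), AIC(HMM_move_dive_3s), AIC(HMM_move_dive)))
```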
## Model AIC
## 1 HMM_move_dive_4s 70947.64
## 2 HMM_move_dive_3s 71847.07
## 3 HMM_move_dive 73358.89
Thus far, we have discussed various methods to regularise your data if you
have missing locations, as well as calculating and integrating dive metrics into
HMMs, and in yesterday's tutorial we discussed incorporating covariates
on the state transition probabilities. momentuHMM also
provides the option to include covariates on the emission probabilities;
that is, the distribution parameters (e.g., step length mean or sd)
can depend on covariates. For example, we may predict that the
apparent swim speed (e.g., mean step length) depends on ocean current
speed. This section will show how to integrate covariates into the
emission probabilities for mean step length and mean turning angle.
This section will illustrate how to model narwhal step length as a
function of tidal currents. The narwhals in this tutorial primarily
occupy the Eclipse Sound system, where incoming tides lead to tidal
currents into the fjords, and outgoing tides lead to tidal currents out
of the fjords. Unfortunately, there is no high-resolution measured
or modelled data of tidal currents in this system. However, there are
general models based on lunar and solar cycles and broad-scale
coastlines that we can use for the purposes of this tutorial. Specifically,
we will use the calc_earthtide function from the
earthtide package to estimate horizontal displacement; as
the model does not take coastlines into account, the specific direction
of tides cannot be used, but general patterns of still water or
tide-driven currents should hold. For this section, we will use the CRW
data with voided gaps. First, we must project the data into lat-long
(WGS84).
# re-project crw data to WGS84
prep_tracks_gps_crw_NAgaps_wgs84 <-
prep_tracks_gps_crw_NAgaps %>%
as.data.frame() %>% # convert from momentuHMM object to data frame
st_as_sf(coords = c("x", "y")) %>% # converts to an sf object
st_set_crs(2962) %>% # specify projection of data
st_transform(4326) # re-project data to a lat long
# define x and y colums and remove sf geometry
prep_tracks_gps_crw_NAgaps_wgs84 <- prep_tracks_gps_crw_NAgaps_wgs84 %>%
mutate(x = st_coordinates(prep_tracks_gps_crw_NAgaps_wgs84)[,"X"],
y = st_coordinates(prep_tracks_gps_crw_NAgaps_wgs84)[,"Y"]) %>%
st_drop_geometry()
At the small scale we are using, there is not much spatial variation in the tides, so for faster processing, we can just estimate tidal currents at each narwhal time step using the coordinates for the centre of Eclipse Sound (\(79.7^\circ\) W, \(72.6^\circ\) N).
prep_tracks_gps_crw_NAgaps_wgs84 <- prep_tracks_gps_crw_NAgaps_wgs84 %>%
group_by(ID) %>% # group by narwhal ID as calc_earthtide assumes chronological data
mutate(htide = calc_earthtide(utc = time,
do_predict = TRUE,
method = 'horizontal_displacement',
latitude = 72.6,
longitude = -79.7)[,2])
# invert the horizontal displacement so values are strictly positive (I'm not sure what the units for this variable are)
prep_tracks_gps_crw_NAgaps$htide <- -prep_tracks_gps_crw_NAgaps_wgs84$htide + 10
# plot tide data as a function of time
plot(htide ~ time, prep_tracks_gps_crw_NAgaps, type = 'l')
If narwhal movement occurred at a larger scale (e.g., if we studied a longer period), and we wanted to estimate tidal currents independently for each location, we could use the following code (not run):
prep_tracks_gps_crw_NAgaps_wgs84 <- prep_tracks_gps_crw_NAgaps_wgs84 %>%
rowwise() %>% # must apply calc_earthtide function independently to each row
mutate(htide = calc_earthtide(utc = time,
method = 'horizontal_displacement',
latitude = y, # use x,y coordinates from each row
longitude = x)[,2]) %>%
ungroup() # ungroup data so future analyses are not conducted rowwise
Next, we can define the model structure, including the design matrix, which can be used to specify effects of covariates. Note: when specifying a design matrix with DM, starting parameters (i.e., Par0) must be specified on the working scale (i.e., \((-\infty, \infty)\)); therefore, we apply a log transformation to the mean and sd for step.
set.seed(1)
# define state names (optional)
stateNames <- c("resident", "travel")
# define number of states
nbState <- length(stateNames)
# define distributions for each data stream
dist <- list(step = "gamma", angle = "vm")
# define step formula and design matrix
stepDM <- list(mean = ~ htide, sd = ~ 1)
DM <- list(step = stepDM)
# define starting values
mu0 <- log(c(200, 1, # state 1 step mean: intercept + slope (working scale)
500, 1)) # state 2 step mean: intercept + slope (working scale)
sigma0 <- log(c(150, 300)) # SD of the step length (working scale)
kappa0 <- c(0.1, 1) # Turning angle concentration parameter (kappa > 0)
Par0 <- list(step = c(mu0, sigma0), angle = kappa0)Now we can fit the HMM, plot the output, and view the estimated
parameters with confidence intervals. Note, when the model has
covariates, you can include plotCI = T to include the 95%
confidence intervals on the estimated value.
# Fit a 2 state HMM
HMM_StepTides <- fitHMM(prep_tracks_gps_crw_NAgaps, nbState = nbState, dist = dist, Par0 = Par0, DM = DM, stateNames = stateNames)
# plot
plot(HMM_StepTides, ask = FALSE, plotCI = T)
## Decoding state sequence... DONE
# show parameter estimates and confidence intervals
HMM_StepTides$CIbeta$step[c("est", "lower", "upper")] %>% # get estimate and CI
lapply(as.vector) %>% # convert est, lower, and upper to vectors
do.call(cbind, .) %>% # cbind est, lower, and upper into a data.frame
data.frame(names = colnames(HMM_StepTides$mle$step), .) %>% # add column for beta names
kable(digits = 3) # output table with 3 decimal places on values

| names | est | lower | upper |
|---|---|---|---|
| mean_1:(Intercept) | 5.609 | 5.544 | 5.675 |
| mean_1:htide | 0.001 | -0.001 | 0.003 |
| mean_2:(Intercept) | 6.558 | 6.512 | 6.605 |
| mean_2:htide | 0.003 | 0.001 | 0.004 |
| sd_1:(Intercept) | 5.036 | 4.978 | 5.094 |
| sd_2:(Intercept) | 5.521 | 5.474 | 5.568 |
Based on the first plot and the confidence interval for
mean_1:htide spanning zero, it does not appear there is a
relationship between tidal currents and speed for the resident state.
However, there may be a positive relationship with speed for the
travelling state. These results align with the prediction that narwhal
movement is random relative to tides during the resident state and that
movement tends to follow the tidal current during the travelling state to
save energy. momentuHMM has special functions that allow us
to specify state-specific effects of covariates, namely:
state, toState, and betaCol.
Using state2(htide) in the DM for step length, we specify
that tidal currents only affect the travelling state (state 2), as
follows:
set.seed(1)
# define starting values
mu0 <- log(c(200, 500, 1)) # Mean step length (working scale)
sigma0 <- log(c(150, 300)) # Sd of the step length (working scale)
kappa0 <- c(0.1, 1) # Turning angle concentration parameter (kappa > 0)
Par0 <- list(step = c(mu0, sigma0), angle = kappa0)
# define formula.
# state 2 mean is a function of htide
stepDM <- list(mean = ~ state2(htide), sd = ~ 1)
DM <- list(step = stepDM)
# Fit HMM
HMM_S2StepTides <- fitHMM(prep_tracks_gps_crw_NAgaps, nbState = nbState, dist = dist, Par0 = Par0, DM = DM, stateNames = stateNames)
# plot
plot(HMM_S2StepTides, ask = FALSE, plotCI = T, plotTracks = F)
## Decoding state sequence... DONE
# show parameter estimates and confidence intervals
HMM_S2StepTides$CIbeta$step[c("est", "lower", "upper")] %>% # get estimate and CI
lapply(as.vector) %>% # convert to vectors
do.call(cbind, .) %>% # cbind into data.frame
data.frame(names = colnames(HMM_S2StepTides$mle$step), .) %>% # add beta names
kable(digits = 3) # output table

| names | est | lower | upper |
|---|---|---|---|
| mean_1:(Intercept) | 5.644 | 5.605 | 5.684 |
| mean_2:(Intercept) | 6.566 | 6.521 | 6.611 |
| mean_2:htide | 0.003 | 0.001 | 0.004 |
| sd_1:(Intercept) | 5.039 | 4.982 | 5.097 |
| sd_2:(Intercept) | 5.521 | 5.474 | 5.568 |
Here we see that tidal currents do appear to affect step length in state 2. As the observation data are the same between this model and the model in section @ref(void_gaps), we can use AIC to compare models with and without tidal data.
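The three AIC values below were presumably produced by separate calls along these lines (a sketch; the model object names follow those used above):

```r
AIC(HMM_gps_crw_NAgaps) # original model without tidal effects
AIC(HMM_StepTides)      # tidal effect on both states' step means
AIC(HMM_S2StepTides)    # tidal effect on the travel state only
```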
## [1] 35618.32
## [1] 35779.65
## [1] 35779.41
In this case, the original model without tidal effects had the best AIC, suggesting that including tides did not explain enough additional variation to warrant the additional parameter.
For me, using the special functions (i.e., state,
toState, and betaCol) can get confusing.
Instead, I prefer to define pseudo-DMs explicitly so I know exactly how
each parameter is defined. The previous HMM can be identically defined
using the following code (we will not run this):
# define pseudo-design matrix
# b1 b2 b3 b4 b5
stepDM <- matrix(c(1, 0, 0, 0, 0, # mean 1 (intercept-only)
0, 1, "htide",0, 0, # mean 2 (intercept + slope*htide)
0, 0, 0, 1, 0, # sd 1 (intercept-only)
0, 0, 0, 0, 1),# sd 2 (intercept-only)
byrow = T, nrow = nbState*2,
dimnames = list(c("mean_1","mean_2","sd_1","sd_2"),
c("mean_1", "mean_2:int", "mean_2:htide", "sd_1", "sd_2")))
DM <- list(step = stepDM)
# fit HMM as in previous code chunk
Note: adding dimension names is optional, but can help with
interpretation. Remember, each row of the DM corresponds with the
state-specific distribution parameters and each column represents the
beta coefficients estimated by the model. A basic HMM without any
covariates effectively uses a diagonal matrix of 1s. Multiple values
in one row are each multiplied by their respective beta coefficients,
summed, and then transformed using the link function (in this case, log for both mean and
SD). Mathematically, the above DM can be written as follows: \[\begin{equation} \label{eq:sl}
\begin{aligned}
\log\mu_{S,t}^{(l)} &=
\begin{cases}
\beta_{1} & \mathrm{if} \: S = \mathrm{resident}\\
\beta_{2} + \beta_{3}r^{(\mathrm{htide})}_t & \mathrm{if} \: S = \mathrm{travel}
\end{cases} \\
\log\sigma_{S,t}^{(l)} &=
\begin{cases}
\beta_{4} & \mathrm{if} \: S = \mathrm{resident}\\
\beta_{5} & \mathrm{if} \: S = \mathrm{travel},
\end{cases}
\end{aligned}
\end{equation}\]
where \(\mu_S^{(l)}\) is the
state-specific mean of step length, \(\sigma_{S}^{(l)}\) is the state-specific SD
of step length, \(\beta \in (-\infty,
\infty)\) are the beta coefficients estimated by the model, and
\(r^{(htide)}_t\) is the rate of
horizontal tide movement at time \(t\).
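To recover natural-scale parameters from these working-scale betas, the linear predictor is exponentiated. A quick sketch using the (rounded) estimates from the table above, so the results are only approximate:

```r
b2 <- 6.566                  # mean_2 intercept (working scale, from the table above)
b3 <- 0.003                  # mean_2 htide slope
htide <- c(-50, 0, 50)       # example horizontal tide rates
mu2 <- exp(b2 + b3 * htide)  # state-2 mean step length on the natural scale
round(mu2)                   # increases monotonically with htide
```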
It is possible to include covariates on any data stream,
including turning angle. Thus far, we have been assuming that all
movement tends to have directional persistence, such that the mean
turning angle in every state is \(\mu_S^{(\phi)} = 0\) and only the angular
concentration is state-specific (i.e., \(\kappa_S^{(\phi)}\) is state-specific). If
we want to include covariates on \(\kappa\), we can do so as in the previous
section. To include covariates on the mean turning angle \(\mu_S^{(\phi)}\), we must also add
circularAngleMean = TRUE in fitHMM and also
include additional starting parameters for angle (thus far we have only
been specifying starting parameters for \(\kappa_S\)). When no covariates are
included on the mean turning angle, we are effectively describing the
movement as a correlated random walk (CRW), wherein the only factor that
determines the direction an animal moves is directional persistence
(i.e., tending to move forward in the same direction). However, most
behaviours exhibit some directional bias relative to the external
stimuli. For example, when travelling, narwhals may orient themselves in
the same direction as currents to save energy, or if killer whales are
present on a large scale (e.g., within 100 km) they may orient
themselves toward shore as suggested by Breed et
al. (2017), or if killer whales are very near, they may orient their
movement directly away from killer whales. Movement where orientation is
a combination of correlated directional persistence and bias relative to
external stimuli is known as a biased correlated random walk (BCRW).
Many interesting hypotheses can be tested by including covariates on the
mean turning angle. Directional bias can be directly toward or away from
a stimulus (e.g., resource patch, or predator location) or it may be at
some angle relative to the stimulus (e.g., bird flight \(45^\circ\) relative to wind to maximise
ground speed, polar bear crosswind movement during olfactory search, or
sea turtle navigation relative to magnetic fields). Movement with an
angled bias is known as menotaxis; it was the focus of my PhD and is described in Togunov
et al. (2021) and Togunov
et al. (2022). In this section we will examine a basic BCRW where a
narwhal moves parallel to the estimated direction of tidal currents, and
a more complex example where they travel parallel to the nearest
coastline. We will have to use prepData slightly
differently this time and will use the tracks_gps_crw data
for clarity (i.e., we will not void gaps).
As in the previous section, we must first project the data into WGS84:
tracks_gps_crw_wgs84 <- tracks_gps_crw %>%
st_as_sf(coords = c("x", "y")) %>% # converts to an sf object
st_set_crs(2962) %>% # specify projection of data
st_transform(4326)
Second, we will use the calc_earthtide function to
estimate the direction of tides, specifically, the eastward
(e_w_displacement) and northward
(n_s_displacement) components of tidal currents. Again, for
brevity, we will only calculate the direction in the centre of Eclipse
Sound (\(79.7^\circ\) W, \(72.6^\circ\)N), as this package does not
have the resolution to estimate variability among the different fjords
at our scale.
tracks_gps_crw_wgs84 <- tracks_gps_crw_wgs84 %>%
group_by(ID) %>% # group by narwhal ID as calc_earthtide assumes chronological data
mutate(tide_u = calc_earthtide(utc = time,
method = "e_w_displacement",
latitude = 72.6,
longitude = -79.7)[,2],
tide_v = calc_earthtide(utc = time,
method = "n_s_displacement",
latitude = 72.6,
longitude = -79.7)[,2]) %>% # longitude must be supplied here as well
ungroup()
# convert back to EPSG2962
tracks_gps_crw_wgs84 <- tracks_gps_crw_wgs84 %>%
st_transform(2962) %>% # re-project data back to UTM
# define x and y and drop geometry
{mutate(., x = st_coordinates(.)[,"X"],
y = st_coordinates(.)[,"Y"])} %>%
st_drop_geometry()
Next, we can calculate the tide direction using the
atan2 function. This calculates the direction in radians
counter-clockwise from east.
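That calculation is not shown above; a hedged sketch, assuming the tide_u (eastward) and tide_v (northward) columns computed earlier:

```r
# direction of tidal displacement in radians counter-clockwise from east
tide_dir_from_uv <- function(u, v) atan2(v, u)

tide_dir_from_uv(1, 0)  # due east  -> 0
tide_dir_from_uv(0, 1)  # due north -> pi/2
tide_dir_from_uv(1, 1)  # northeast -> pi/4

# applied to the track (column names assumed from the chunk above):
# tracks_gps_crw_wgs84$tide_dir <- with(tracks_gps_crw_wgs84, atan2(tide_v, tide_u))
```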
Now, when we prep the data for momentuHMM, we must
specify that "tide_dir" is an angular covariate. Because
turning angle is calculated as the angle between the step at time \(t\) relative to the step at time \(t-1\), for the angular covariate,
momentuHMM will calculate the angle between the animal’s
step at time \(t-1\) relative to the
tides at time \(t\).
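As an illustration of that relative-angle geometry (a sketch only, not momentuHMM's internal code), the difference between the covariate direction and the previous bearing is wrapped into \((-\pi, \pi]\):

```r
# wrap an angle (radians) into (-pi, pi]
wrap_angle <- function(a) -((-a + pi) %% (2 * pi) - pi)

# signed angle of the stimulus (e.g., tide) relative to the previous bearing
rel_angle <- function(stimulus_dir, bearing) wrap_angle(stimulus_dir - bearing)

rel_angle(pi / 4, 0)           # stimulus 45 degrees to the left -> pi/4
rel_angle(-3 * pi / 4, pi / 2) # wraps around -> 3*pi/4
```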
# prep data
prep_gps_crw_wgs84 <- tracks_gps_crw_wgs84 %>%
as.data.frame() %>%
prepData(type = "UTM", angleCovs = "tide_dir")
As the location data is the same as the previous section, we can pull its step and angle to void large data gaps.
Before we fit the HMM, we can define the DM for turning angle. Here, I will specify the DM explicitly using pseudo-DMs.
# define turning angle pseudo-DM
# b1 b2 b3
angleDM <- matrix(c(0, 0, 0,# mean 1
"tide_dir",0, 0,# mean 2
0, 1, 0, # k 1
0, 0, 1),# k 2
byrow = T, nrow = nbState*2,
dimnames = list(c("mean_1","mean_2","k_1","k_2"),
c("mean_2:tide","k_1","k_2")))
DM <- list(angle = angleDM)
In the above DM, we are effectively fixing the mean turning angle for state 1 to \(0^\circ\) and stating that the mean direction of state 2 (travel) depends on the tide direction.
Now, we will define the starting parameters for the model and fit the HMM. We must add two arguments to fitHMM. First, we must
specify for which angular data streams we want to estimate the mean
turning angle using estAngleMean. Second, when we are
including covariates on the turning angle mean, we should also set
circularAngleMean = T. circularAngleMean will
fit a special link function based on Rivest
et al. (2016):
\[\begin{equation}
\mu_{t}^{(\phi)} = \mathrm{atan2}\left(\alpha_{1}\sin\psi_t,\; 1 + \alpha_{1}\cos\psi_t\right),
\end{equation}\]
where \(\psi_t \in (-\pi,\pi]\) is the observed angle of stimulus at \(t\) relative to the movement bearing at \(t-1\) and \(\alpha_{1} \in (-\infty, \infty)\) is the bias coefficient in the direction \(\psi_t\). In this model, when \(\alpha_{1} = 0\), the model simplifies to a basic CRW with a turning angle mean centred at 0. However, as \(\alpha_{1}\) increases (or becomes increasingly negative), the movement becomes increasingly governed by the stimulus (i.e., toward or away from the direction of stimulus) and less by directional persistence.
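The behaviour of this link can be sketched in base R (an illustration of the published formula, not momentuHMM's internal code):

```r
# circular angle mean link: mu = atan2(a1*sin(psi), 1 + a1*cos(psi))
circ_mean <- function(a1, psi) atan2(a1 * sin(psi), 1 + a1 * cos(psi))

circ_mean(0, 1)     # a1 = 0: simplifies to a CRW, mean turning angle 0
circ_mean(100, 1)   # large a1: mean pulled almost exactly to psi = 1
circ_mean(-100, 1)  # large negative a1: pulled toward the opposite direction
```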
set.seed(1)
# define starting values
mu0 <- c(200, 500) # Mean step length
sigma0 <- c(150, 300) # SD of the step length
angle0 <- 1 # mean 2 TA
kappa0 <- c(0.1, 1) # TA concentration parameter (kappa)
Par0 <- list(step = c(mu0, sigma0), angle = c(angle0,kappa0))
# Fit HMM
HMM_S2AngleTides <- fitHMM(prep_gps_crw_wgs84, nbState = nbState, dist = dist, Par0 = Par0, DM = DM, stateNames = stateNames, estAngleMean = list(angle = T), circularAngleMean = list(angle = T))
## =======================================================================
## Fitting HMM with 2 states and 2 data streams
## -----------------------------------------------------------------------
## step ~ gamma(mean=~1, sd=~1)
## angle ~ vm(mean: custom, concentration: custom)
##
## Transition probability matrix formula: ~1
##
## Initial distribution formula: ~1
## =======================================================================
## DONE
## Decoding state sequence... DONE
# show parameter estimates and confidence intervals
HMM_S2AngleTides$CIbeta$angle[c("est", "lower", "upper")] %>% # get estimate and CI
lapply(as.vector) %>% # convert to vectors
do.call(cbind, .) %>% # cbind into data.frame
data.frame(names = colnames(HMM_S2AngleTides$mle$angle), .) %>% # add beta names
kable(digits = 3) # output table
| names | est | lower | upper |
|---|---|---|---|
| mean_2:tide | 0.007 | -0.012 | 0.026 |
| k_1 | 0.823 | 0.749 | 0.896 |
| k_2 | 2.889 | 2.786 | 2.992 |
From the confidence intervals for mean_2:tide and the second
plot, it is clear that the tidal direction estimated using the
earthtide package explains little of the observed turning
angle. In the second figure, the x-axis represents the angle of the
tide relative to the previous step and the y-axis is the estimated
turning angle. We see that when the tide angle is \(0^\circ\), the
expected mean turning angle is also \(0^\circ\) (i.e., the tide is in
the same direction as directional persistence and no turn is
expected). Moving to the right, as the tide angle increases (i.e.,
rotates towards the left), the turning angle also increases (i.e.,
also turns towards the left). Conversely, as the tide angle decreases
(i.e., right relative to the narwhal), the narwhal's turning angle
also shifts to the right (negative values). However, throughout the
plot, the confidence interval of the estimated turning angle always
encompasses \(0^\circ\), suggesting there is not a strong
relationship. This result makes sense as the earthtide
package does not accurately represent tides in the complex coastline
of Eclipse Sound, where waterways are angled in various directions
(e.g., NBI, MI, TaS, and PI).
#### Menotactic BCRW
Our hypothesis is that when travelling, narwhal movement is parallel
to the coastline. In the context of a BCRW, we can model this by
defining turning angle as a function of the direction of the nearest
coastline. We can obtain the direction of the nearest coastline using
the direction function from the terra package, and the
angle of the coastline is perpendicular to this direction.
## [1] -297006.3 -180506.7
## [1] 8149763 8209172
r <- rast(ext(-320000, -150000, 8140000, 8250000), # define extent
resolution = 100, crs = "epsg:2962") # define res and CRS
land_rast <- rasterize(land, r) # rasterise land layer to defined raster
plot(land_rast)
land_dir <- direction(land_rast) # calculate direction of each water cell to nearest land cell
plot(land_dir)
# direction() is clockwise relative to vertical (north), so we will have to invert and rotate
# invert and rotate clockwise by pi/2
land_dir <- -1*land_dir + pi/2
# use ifel (terra's ifelse for SpatRasters) to add 2*pi to cells less than or equal to -pi
land_dir <- ifel(land_dir <= -pi, land_dir + 2*pi, land_dir)
plot(land_dir)
# convert track to spatVector
tracks_gps_crw_vect <- tracks_gps_crw_wgs84 %>%
as.data.frame() %>%
vect(geom = c("x", "y"), crs = "epsg:2962")
# extract dir to nearest land cell
tracks_gps_crw_wgs84$land_dir <- extract(land_dir, tracks_gps_crw_vect)[,"layer"]
Specifically, when in the travelling state, the narwhal turning angle
is biased toward \(\pm90^\circ\)
relative to the direction to the nearest coastline. The main challenge
in implementing this model is that there are two ways to move parallel
to the coastline: up the coastline and down the coastline. For
a given angular covariate, momentuHMM allows us to model
angular bias toward/away from one angle (i.e., either \(+90^\circ\) or \(-90^\circ\)). To include two possible
directions, we have to use a trick where we divide the one behaviour
into two states with their own respective biases. In this case, we will
divide the travelling state into travelling left
("travel_L") and travelling right ("travel_R")
of the direction to the nearest coastline.
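The compass-to-math conversion applied to land_dir above (direction returns radians clockwise from north, while turning angles run counter-clockwise from east) can be sanity-checked in base R:

```r
# convert radians clockwise from north to radians counter-clockwise
# from east, wrapped into (-pi, pi] (same arithmetic as the raster code)
compass_to_math <- function(theta) {
  a <- -theta + pi / 2
  ifelse(a <= -pi, a + 2 * pi, a)
}

compass_to_math(0)          # north -> pi/2
compass_to_math(pi / 2)     # east  -> 0
compass_to_math(pi)         # south -> -pi/2
compass_to_math(3 * pi / 2) # west  -> pi
```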
# define model
stateNames <- c("resident", "travel_L", "travel_R")
nbState <- length(stateNames)
# define distributions for each data stream
dist <- list(step = "gamma", angle = "vm")
Prep the data:
# prep data
prep_gps_crw_wgs84 <- tracks_gps_crw_wgs84 %>%
as.data.frame() %>%
prepData(type = "UTM", angleCovs = "land_dir")
## Warning in prepData.default(., type = "UTM", angleCovs = "land_dir"): There
## are 27 missing covariate values. Each will be replaced by the closest available
## value.
The warning message arises because some locations were so close to land
that they fell on a cell defined as land, where there is no
applicable “direction to nearest land cell”. It may also occur if
interpolated locations fall on land. In cases with missing covariate
data, momentuHMM will replace the NA with the closest
available value, which should be good enough for our purposes.
As the location data is the same as the previous section, we can pull its step and angle to void large data gaps.
Next, we must define the pseudo-DM, where travel_L is
biased toward land_dir + pi/2 and travel_R is
biased toward land_dir - pi/2. In formula notation, when we
want to execute a mathematical operation as written, we must use
R's function I() (see help("I") for details). Second, because "travel_L" and "travel_R"
represent the same biological behaviour, which should have the same
parameters, we can include their arguments in the same columns in the
pseudo-DM.
# define turning angle pseudo-DM
# b1 b2 b3
angleDM <- matrix(c(0, 0, 0,# mean resident
"I(land_dir + 3.14/2)",0, 0,# mean travel_L
"I(land_dir - 3.14/2)",0, 0,# mean travel_R
0, 1, 0, # k resident
0, 0, 1, # k travel_L
0, 0, 1),# k travel_R
byrow = T, nrow = nbState*2,
dimnames = list(c("mean_R","mean_TL","mean_TR", "k_R","k_TL","k_TR"),
c("mean_T:land","k_R","k_T")))
In the above pseudo-DM, the coefficients "mean_T:land"
and "k_T" will be shared between "travel_L"
and "travel_R" as the rows are identical. To remove
additional parameters, we will also specify the pseudo-DM for the step
length with shared parameters for the travel states.
# define step length pseudo-DM
# b1 b2 b3 b4
stepDM <- matrix(c(1, 0, 0, 0,# mean resident
0, 1, 0, 0,# mean travel_L
0, 1, 0, 0,# mean travel_R
0, 0, 1, 0, # sd resident
0, 0, 0, 1, # sd travel_L
0, 0, 0, 1),# sd travel_R
byrow = T, nrow = nbState*2,
dimnames = list(c("mean_R","mean_TL","mean_TR", "sd_R","sd_TL","sd_TR"),
c("mean_R","mean_T", "sd_R","sd_T")))
# combine DMs
DM <- list(step = stepDM, angle = angleDM)
We can also share the transition probabilities to and from the travel
states using the beta constraints (betaCons) argument in
the fitHMM function. betaCons is a matrix
representing the off-diagonal elements of the state transition
probability matrix (i.e., not including the probabilities of remaining
in the same state). The values of the matrix must be integers
identifying any equality constraints (identical integers), wherein the
integer must represent the lowest index of constrained values (see
betaCons details in help("fitHMM")).
# To: R,TL,TR
betaCons <- matrix(c( 1, 1, # from R
3, 4, # from TL
3, 4 ), # from TR
nrow = 1, byrow = T)
In the above code, we left the diagonal values (remaining in the same state) empty for visual clarity. The two 1s convey that the transition from R to either TL or TR is the same. No index of “2” (from R to TR) is used as it is constrained to index “1”. The 3s convey that the probabilities of transitioning from TL to R or TR to R are the same. Last, the 4s convey that the probability of transitioning from TL to TR is the same as from TR to TL.
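To see which transition each column of betaCons constrains, the columns follow momentuHMM's row-wise ordering of the off-diagonal transitions (1→2, 1→3, 2→1, 2→3, 3→1, 3→2); a small self-contained sketch labelling the constraint vector (state abbreviations assumed):

```r
states <- c("R", "TL", "TR")
# transition labels in row-wise off-diagonal order: 1->2, 1->3, 2->1, ...
m <- outer(states, states, function(from, to) paste(from, to, sep = "->"))
trans <- as.vector(t(m))[as.vector(t(diag(3))) == 0]
setNames(c(1, 1, 3, 4, 3, 4), trans)
#  R->TL  R->TR  TL->R TL->TR  TR->R TR->TL
#      1      1      3      4      3      4
```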
Now we can specify the starting parameters and fit the HMM.
set.seed(1)
# define starting values
mu0 <- log(c(200, 500)) # SL mean (500 shared by TL and TR)
sigma0 <- log(c(150, 300)) # SL sd (300 shared by TL and TR)
angle0 <- 5 # TA mean
kappa0 <- log(c(0.1, 1)) # TA kappa (1 shared by TL and TR)
Par0 <- list(step = c(mu0, sigma0), angle = c(angle0,kappa0))
# Fit HMM
HMM_AngleLand <- fitHMM(prep_gps_crw_wgs84, nbState = nbState, dist = dist, Par0 = Par0, DM = DM, stateNames = stateNames, estAngleMean = list(angle = T), circularAngleMean = list(angle = T), betaCons = betaCons)
## =======================================================================
## Fitting HMM with 3 states and 2 data streams
## -----------------------------------------------------------------------
## step ~ gamma(mean: custom, sd: custom)
## angle ~ vm(mean: custom, concentration: custom)
##
## Transition probability matrix formula: ~1
##
## Initial distribution formula: ~1
## =======================================================================
## DONE
## Decoding state sequence... DONE
## Decoding state sequence... DONE
# show parameter estimates and confidence intervals
HMM_AngleLand$CIbeta$angle[c("est", "lower", "upper")] %>% # get estimate and CI
lapply(as.vector) %>% # convert to vectors
do.call(cbind, .) %>% # cbind into data.frame
data.frame(names = colnames(HMM_AngleLand$mle$angle), .) %>% # add beta names
kable(digits = 3) # output table
| names | est | lower | upper |
|---|---|---|---|
| mean_T:land | 0.162 | 0.129 | 0.195 |
| k_R | 0.819 | 0.746 | 0.892 |
| k_T | 3.077 | 2.958 | 3.195 |
Given that the confidence intervals do not include 0, it appears that at
least some of narwhal orientation during travel is governed by the
direction of the coastline. We can map these states with the coastline
using tmap. For clarity, we will filter one individual.
# identify most likely state using the Viterbi algorithm
prep_gps_crw_wgs84$state <- viterbi(HMM_AngleLand)
# convert to SF object
prep_gps_crw_sf_sub <- prep_gps_crw_wgs84 %>%
filter(ID == "T172066") %>%
st_as_sf(coords = c("x", "y")) %>% # converts to an sf object
st_set_crs(2962)
# plot with tmap
tm_shape(prep_gps_crw_sf_sub) +
tm_dots(col = "state", palette = c("brown", "dodgerblue", "darkblue"), size = 0.1) +
tm_shape(st_as_sf(land)) +
tm_polygons()
#### Bonus
If the animal is presumed to move at some angle other than
\(0^\circ\), \(180^\circ\), or \(\pm90^\circ\) that is not known a priori,
we can allow the model to estimate the angle of bias by defining animal
movement as a trade-off between directional persistence, bias in the
same direction as the stimulus, and bias perpendicular to the stimulus.
For example, if we thought narwhals moved at some unknown angle relative
to the direction of the tides, the pseudo-DM could be formatted as
follows:
# define turning angle pseudo-DM
# b1 b2 b3 b4
angleDM <- matrix(c(0, 0, 0, 0,# mean resident
"tide", "I(tide + 3.14/2)",0, 0,# mean travel_L
"tide", "I(tide - 3.14/2)",0, 0,# mean travel_R
0, 0, 1, 0, # k resident
0, 0, 0, 1, # k travel_L
0, 0, 0, 1),# k travel_R
byrow = T, nrow = nbState*2,
dimnames = list(c("mean_R","mean_TL","mean_TR", "k_R","k_TL","k_TR"),
c("mean_T:tide","mean_T:tide90","k_R","k_T")))
After the model is fit and the beta coefficients are estimated, the
angle of attraction can be calculated using atan2(b2, b1).
See Togunov et al. (2021) and Togunov et al. (2022) for details.
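A quick sketch of that calculation with invented coefficients (b1 is the bias along the stimulus direction, b2 the bias perpendicular to it):

```r
b1 <- 1                      # hypothetical bias along the stimulus direction
b2 <- 1                      # hypothetical bias perpendicular to the stimulus
bias_angle <- atan2(b2, b1)  # angle of attraction relative to the stimulus
bias_angle * 180 / pi        # equal along/perpendicular pull -> 45 degrees
```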